{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "0",
   "metadata": {},
   "source": [
    "## How to export thousands of image chips from Earth Engine in a few minutes\n",
    "\n",
    "This source code of this notebook was adopted from the Medium post - [Fast(er) Downloads](https://gorelick.medium.com/fast-er-downloads-a2abd512aa26) by Noel Gorelick. Credits to Noel.  \n",
    "\n",
    "Due to the [limitation](https://docs.python.org/3/library/multiprocessing.html) of the [multiprocessing](https://docs.python.org/3/library/multiprocessing.html) package, the functionality of this notebook can only be run in the top-level. Therefore, it could not be implemented as a function under geemap. \n",
    "\n",
    "### Install packages\n",
    "\n",
    "Uncomment the following line to install the required packages."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1",
   "metadata": {},
   "outputs": [],
   "source": [
    "# !pip install geemap retry"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2",
   "metadata": {},
   "source": [
    "### Import libraries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3",
   "metadata": {},
   "outputs": [],
   "source": [
    "import ee\n",
    "import geemap\n",
    "import logging\n",
    "import multiprocessing\n",
    "import os\n",
    "import requests\n",
    "import shutil\n",
    "from retry import retry"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4",
   "metadata": {},
   "source": [
    "### Initialize GEE to use the high-volume endpoint\n",
    "\n",
    "- [high-volume endpoint](https://developers.google.com/earth-engine/cloud/highvolume)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5",
   "metadata": {},
   "outputs": [],
   "source": [
    "ee.Initialize(opt_url=\"https://earthengine-highvolume.googleapis.com\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6",
   "metadata": {},
   "source": [
    "### Create an interactive map"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7",
   "metadata": {},
   "outputs": [],
   "source": [
    "Map = geemap.Map()\n",
    "Map"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8",
   "metadata": {},
   "source": [
    "### Define the Region of Interest (ROI)\n",
    "\n",
    "You can use the drawing tools on the map to draw an ROI, then you can use `Map.user_roi` to retrieve the geometry. Alternatively, you can define the ROI as an ee.Geometry as shown below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9",
   "metadata": {},
   "outputs": [],
   "source": [
    "# region = Map.user_roi\n",
    "region = ee.Geometry.Polygon(\n",
    "    [\n",
    "        [\n",
    "            [-122.513695, 37.707998],\n",
    "            [-122.513695, 37.804359],\n",
    "            [-122.371902, 37.804359],\n",
    "            [-122.371902, 37.707998],\n",
    "            [-122.513695, 37.707998],\n",
    "        ]\n",
    "    ],\n",
    "    None,\n",
    "    False,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "10",
   "metadata": {},
   "source": [
    "### Define the image source\n",
    "\n",
    "Using the 1-m [NAIP imagery](https://developers.google.com/earth-engine/datasets/catalog/USDA_NAIP_DOQQ)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "11",
   "metadata": {},
   "outputs": [],
   "source": [
    "image = (\n",
    "    ee.ImageCollection(\"USDA/NAIP/DOQQ\")\n",
    "    .filterBounds(region)\n",
    "    .filterDate(\"2020\", \"2021\")\n",
    "    .mosaic()\n",
    "    .clip(region)\n",
    "    .select(\"N\", \"R\", \"G\")\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "12",
   "metadata": {},
   "source": [
    "Using the 10-m [Sentinel-2 imagery](https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2_SR#bands)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "13",
   "metadata": {},
   "outputs": [],
   "source": [
    "# image = ee.ImageCollection('COPERNICUS/S2_SR') \\\n",
    "#             .filterBounds(region) \\\n",
    "#             .filterDate('2021', '2022') \\\n",
    "#             .select('B8', 'B4', 'B3') \\\n",
    "#             .median() \\\n",
    "#             .visualize(min=0, max=4000) \\\n",
    "#             .clip(region)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "14",
   "metadata": {},
   "source": [
    "### Set parameters\n",
    "\n",
    "If you want the exported images to have coordinate system, change `format` to `GEO_TIFF`. Otherwise, you can use `png` or `jpg` formats."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "15",
   "metadata": {},
   "outputs": [],
   "source": [
    "params = {\n",
    "    \"count\": 100,  # How many image chips to export\n",
    "    \"buffer\": 127,  # The buffer distance (m) around each point\n",
    "    \"scale\": 100,  # The scale to do stratified sampling\n",
    "    \"seed\": 1,  # A randomization seed to use for subsampling.\n",
    "    \"dimensions\": \"256x256\",  # The dimension of each image chip\n",
    "    \"format\": \"png\",  # The output image format, can be png, jpg, ZIPPED_GEO_TIFF, GEO_TIFF, NPY\n",
    "    \"prefix\": \"tile_\",  # The filename prefix\n",
    "    \"processes\": 25,  # How many processes to used for parallel processing\n",
    "    \"out_dir\": \".\",  # The output directory. Default to the current working directly\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "16",
   "metadata": {},
   "source": [
    "### Add layers to map"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "17",
   "metadata": {},
   "outputs": [],
   "source": [
    "Map.addLayer(image, {}, \"Image\")\n",
    "Map.addLayer(region, {}, \"ROI\", False)\n",
    "Map.setCenter(-122.4415, 37.7555, 12)\n",
    "Map"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "18",
   "metadata": {},
   "source": [
    "### Generate a list of work items\n",
    "\n",
    "In the example, we are going to generate 1000 points using the stratified random sampling, which requires a `classBand`. It is the name of the band containing the classes to use for stratification. If unspecified, the first band of the input image is used. Therefore, we have toADD a new band with a constant value (e.g., 1) to the image. The result of the `getRequests()`function returns a list of dictionaries containing points."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "19",
   "metadata": {},
   "outputs": [],
   "source": [
    "def getRequests():\n",
    "    img = ee.Image(1).rename(\"Class\").addBands(image)\n",
    "    points = img.stratifiedSample(\n",
    "        numPoints=params[\"count\"],\n",
    "        region=region,\n",
    "        scale=params[\"scale\"],\n",
    "        seed=params[\"seed\"],\n",
    "        geometries=True,\n",
    "    )\n",
    "    Map.data = points\n",
    "    return points.aggregate_array(\".geo\").getInfo()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "20",
   "metadata": {},
   "source": [
    "### Create a function for downloading image"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "21",
   "metadata": {},
   "source": [
    "The `getResult()` function then takes one of those points and generates an image centered on that location, which is then downloaded as a PNG and saved to a file. This function uses `image.getThumbURL()` to select the pixels, however you could also use `image.getDownloadURL()` if you wanted the output to be in GeoTIFF or NumPy format ([source](https://gorelick.medium.com/fast-er-downloads-a2abd512aa26))."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "22",
   "metadata": {},
   "outputs": [],
   "source": [
    "@retry(tries=10, delay=1, backoff=2)\n",
    "def getResult(index, point):\n",
    "    point = ee.Geometry.Point(point[\"coordinates\"])\n",
    "    region = point.buffer(params[\"buffer\"]).bounds()\n",
    "\n",
    "    if params[\"format\"] in [\"png\", \"jpg\"]:\n",
    "        url = image.getThumbURL(\n",
    "            {\n",
    "                \"region\": region,\n",
    "                \"dimensions\": params[\"dimensions\"],\n",
    "                \"format\": params[\"format\"],\n",
    "            }\n",
    "        )\n",
    "    else:\n",
    "        url = image.getDownloadURL(\n",
    "            {\n",
    "                \"region\": region,\n",
    "                \"dimensions\": params[\"dimensions\"],\n",
    "                \"format\": params[\"format\"],\n",
    "            }\n",
    "        )\n",
    "\n",
    "    if params[\"format\"] == \"GEO_TIFF\":\n",
    "        ext = \"tif\"\n",
    "    else:\n",
    "        ext = params[\"format\"]\n",
    "\n",
    "    r = requests.get(url, stream=True)\n",
    "    if r.status_code != 200:\n",
    "        r.raise_for_status()\n",
    "\n",
    "    out_dir = os.path.abspath(params[\"out_dir\"])\n",
    "    basename = str(index).zfill(len(str(params[\"count\"])))\n",
    "    filename = f\"{out_dir}/{params['prefix']}{basename}.{ext}\"\n",
    "    with open(filename, \"wb\") as out_file:\n",
    "        shutil.copyfileobj(r.raw, out_file)\n",
    "    print(\"Done: \", basename)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "23",
   "metadata": {},
   "source": [
    "### Download images"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "24",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "logging.basicConfig()\n",
    "items = getRequests()\n",
    "\n",
    "pool = multiprocessing.Pool(params[\"processes\"])\n",
    "pool.starmap(getResult, enumerate(items))\n",
    "\n",
    "pool.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "25",
   "metadata": {},
   "source": [
    "### Retrieve sample points"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "26",
   "metadata": {},
   "outputs": [],
   "source": [
    "Map.addLayer(Map.data, {}, \"Sample points\")\n",
    "Map"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
